Search for: All records

Creators/Authors contains: "Purohit, Sumit"


  1. Cyberattacks on power grids pose significant risks to national security. Power grid attacks typically lead to abnormal readings in power output, frequency, current, and voltage. Because power grids are interconnected, abnormalities can spread throughout the system and cause widespread power outages if they are not detected and dealt with promptly. Our research proposes a novel anomaly detection system for power grids that prevents overfitting. We created a network graph to represent the structure of the power grid, where nodes represent power grid components such as generators and edges represent connections between nodes, such as overhead power lines. We combine Long Short-Term Memory (LSTM) models with a Graph Isomorphism Network (GIN) in a hybrid model to pinpoint anomalies in the grid. To prevent overfitting, we train our model on each category of nodes that serve a similar structural purpose. We then assign each node in the graph a unique signature using a GIN. Our model achieved an accuracy of 99.92%, which is significantly higher than the 97.30% accuracy of a version of our model without structural encoding. Our model captures both the structural and temporal components of power grids, yielding an attack detection system with high accuracy and without overfitting.
  2. Coronavirus Disease 2019 (Covid-19) is an ongoing outbreak and the latest threat to global health. It is imperative to understand the implications of social interaction for Covid-19 indicators in order to help governments and local authorities formulate policies and guidelines. We present a case study of curating state-level Covid-19 indicators such as Active Cases, Deaths, Hospitalization Rate, etc. for the United States. We also curate open-source domestic US air travel data and present its impact on Covid-19 indicators. We perform a time-series analysis of the dataset using Independent Temporal Motif (ITeM) to find weekly trends in the data. We publish the dataset and the results for further exploration by the research community.
  3. Graph mining is an important data analysis methodology, but it struggles as the input graph size increases. The scalability and usability challenges posed by such large graphs make it imperative to sample the input graph and reduce its size. The critical challenge in sampling is to identify the appropriate algorithm to ensure that the resulting analysis does not suffer heavily from the data reduction. Predicting the expected performance degradation for a given graph and sampling algorithm is also useful. In this paper, we present different sampling approaches for graph mining applications such as Frequent Subgraph Mining (FSM) and Community Detection (CD). We explore graph metrics such as PageRank, Triangles, and Diversity to sample a graph and conclude that, for heterogeneous graphs, Triangles and Diversity perform better than degree-based metrics. We also present two new sampling variations for targeted graph mining applications. We present empirical results to show that knowledge of the target application, along with input graph properties, can be used to select the best sampling algorithm. We also conclude that performance degradation is an abrupt, rather than gradual, phenomenon as the sample size decreases. We present empirical results to show that the performance degradation follows a logistic function.
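
The third abstract describes sampling a graph by a node metric such as Triangles. It does not spell out the procedure, so the following is only a minimal sketch of one plausible reading: rank nodes by the number of triangles they take part in, keep the top fraction, and return the edges among the kept nodes. The function names `triangle_counts` and `sample_by_triangles`, the `keep_fraction` parameter, and the tie-breaking rule are all assumptions made for illustration, not details taken from the paper.

```python
from itertools import combinations


def triangle_counts(adj):
    """Return, for each node, the number of triangles it participates in.

    `adj` maps each node to the set of its neighbours (undirected graph).
    """
    counts = {}
    for node, nbrs in adj.items():
        # A triangle through `node` is a pair of its neighbours that are
        # themselves connected.
        counts[node] = sum(
            1 for u, v in combinations(sorted(nbrs), 2) if v in adj[u]
        )
    return counts


def sample_by_triangles(edges, keep_fraction):
    """Keep the top `keep_fraction` of nodes ranked by triangle count.

    Hypothetical helper: returns the edges of the subgraph induced by the
    kept nodes. Ties are broken by node label (an arbitrary choice).
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    counts = triangle_counts(adj)
    k = max(1, round(keep_fraction * len(adj)))
    ranked = sorted(adj, key=lambda n: (-counts[n], n))
    kept = set(ranked[:k])
    return [(u, v) for u, v in edges if u in kept and v in kept]


# Toy graph: one triangle (1, 2, 3) with a tail 3-4-5.
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
sampled = sample_by_triangles(toy_edges, keep_fraction=0.6)
```

On the toy graph, the three triangle nodes outrank the two tail nodes, so the sample retains the triangle and drops the tail. A degree-based ranking would instead place node 4 level with nodes 1 and 2, which hints at why the two families of metrics can select different samples.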